Data set:
breast cancer
proteomics by mass spectrometry
four cancer classes:
Goal:
Explore the data to identify patterns
Create models to predict breast cancer class
13 May, 2020
K-means clustering accuracy: 72.7%; artificial neural network (ANN) model accuracy: 82.8%
Collect more data for building more reliable models
Combine proteome data with RNA-seq data to investigate further associations through network analysis
The tidyverse collection of R packages is a smart and elegant toolset for data analysis and visualization
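
The slides do not say how an accuracy figure was obtained for k-means, which is unsupervised and produces cluster IDs rather than class labels. One common convention, assumed here and not confirmed by the source, is to assign each cluster its most frequent true class and count the samples that match. A minimal sketch in Python, chosen for illustration only; the function name `clustering_accuracy` and the toy labels are hypothetical and are not the study's data:

```python
from collections import Counter, defaultdict


def clustering_accuracy(cluster_ids, true_classes):
    """Majority-vote accuracy: each cluster is labelled with its most
    frequent true class, and samples matching that label count as correct."""
    if len(cluster_ids) != len(true_classes):
        raise ValueError("cluster_ids and true_classes must have equal length")
    if len(cluster_ids) == 0:
        raise ValueError("at least one sample is required")

    # Count how often each true class occurs inside each cluster.
    class_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_classes):
        class_counts[cluster][label] += 1

    # The size of the largest class in a cluster is the number of
    # samples that the majority-vote label gets right.
    correct = sum(counts.most_common(1)[0][1] for counts in class_counts.values())
    return correct / len(cluster_ids)


# Hypothetical toy example: 10 samples, 4 clusters, 4 placeholder classes.
clusters = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]
classes = ["c1", "c1", "c2", "c2", "c2", "c3", "c3", "c1", "c4", "c4"]
accuracy = clustering_accuracy(clusters, classes)
print(f"Majority-vote clustering accuracy: {accuracy:.1%}")
```

Under this convention two clusters may map to the same class, so the score can be optimistic; a one-to-one cluster-to-class matching is a stricter alternative.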